Sparsity in Linear Predictive Coding of Speech

Author

  • Daniele Giacobello
Abstract

This thesis develops improved techniques for speech coding based on recent developments in sparse signal representation. In particular, the work is motivated by the need to address some of the limitations of the well-known linear prediction (LP) model currently applied in many modern speech coders. In the first part of the thesis, we provide an overview of Sparse Linear Prediction, a set of speech processing tools created by introducing sparsity constraints into the LP framework. This approach defines predictors that look for a sparse residual rather than a minimum-variance one. Such predictors have direct applications to coding and are also consistent with the speech production model of voiced speech, where the excitation of the all-pole filter can be modeled as an impulse train, i.e., a sparse sequence. Introducing sparsity in the LP framework also leads to the concept of high-order sparse predictors. These predictors, by efficiently modeling the spectral envelope and the harmonic components with very few coefficients, have direct applications in speech processing, enabling a joint estimation of short-term and long-term predictors. We also give preliminary results on the effectiveness of their application in audio processing. The second part of the thesis deals with introducing sparsity directly in the linear prediction analysis-by-synthesis (LPAS) speech coding paradigm. We first propose a novel near-optimal method to look for a sparse approximate excitation using a compressed sensing formulation. Furthermore, we define a novel re-estimation procedure to adapt the predictor coefficients to the given sparse excitation, balancing the two representations in the context of speech coding.
Finally, the advantages of the compact parametric representation of a segment of speech, given by the sparse linear predictors and the use of the re-estimation procedure, are analyzed in the context of frame-independent coding for speech communications over packet networks.
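
The idea of a predictor that seeks a sparse residual rather than a minimum-variance one can be illustrated with a small sketch. The abstract does not give the thesis's exact formulation or solver, so everything below is an assumption made for illustration: the 1-norm of the residual is used as the sparsity measure, it is minimized with iteratively reweighted least squares (IRLS) instead of a dedicated convex solver, and the names `lp_matrices`, `sparse_lp`, and the synthetic test signal are hypothetical.

```python
import numpy as np

def lp_matrices(x, order):
    """Return (y, X) such that y[i] = x[n] and X[i, k-1] = x[n-k], n = order + i."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    X = np.column_stack([x[order - k : N - k] for k in range(1, order + 1)])
    return x[order:], X

def sparse_lp(x, order, n_iter=30, eps=1e-6):
    """Approximate the 1-norm residual minimizer with IRLS (illustrative assumption)."""
    y, X = lp_matrices(x, order)
    a = np.linalg.lstsq(X, y, rcond=None)[0]      # conventional 2-norm solution as a start
    for _ in range(n_iter):
        r = y - X @ a
        w = 1.0 / np.sqrt(np.abs(r) + eps)        # square roots of the weights 1/(|r| + eps)
        a = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)[0]
    return a

# Synthetic "voiced" signal: an impulse train exciting a stable 2nd-order all-pole filter.
a_true = np.array([1.6, -0.9])                    # hypothetical predictor coefficients
N = 400
e = np.zeros(N)
e[::20] = 1.0                                     # sparse excitation, one impulse every 20 samples
x = np.zeros(N)
for n in range(N):
    x[n] = e[n]
    if n >= 1:
        x[n] += a_true[0] * x[n - 1]
    if n >= 2:
        x[n] += a_true[1] * x[n - 2]

y, X = lp_matrices(x, 2)
a_ls = np.linalg.lstsq(X, y, rcond=None)[0]       # minimum-variance predictor
a_sp = sparse_lp(x, 2)                            # sparse-residual predictor
l1_ls = np.sum(np.abs(y - X @ a_ls))
l1_sp = np.sum(np.abs(y - X @ a_sp))
print("2-norm predictor:", a_ls, "residual 1-norm:", l1_ls)
print("1-norm predictor:", a_sp, "residual 1-norm:", l1_sp)
```

Comparing the two printed residual 1-norms shows how the choice of criterion changes the predictor. IRLS is used here only because it needs nothing beyond NumPy. A practical implementation would more likely hand the same 1-norm objective to a convex optimization solver.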

Similar resources

Speech Enhancement using Adaptive Data-Based Dictionary Learning

This paper presents a speech enhancement method based on sparse representation of data frames. Speech enhancement is one of the most widely applied areas of signal processing. The objective of a speech enhancement system is to improve either the intelligibility or the quality of speech signals. This process is carried out using the speech signal processing techniques ...

Fast algorithms for high-order sparse linear prediction with applications to speech processing

In speech processing applications, imposing sparsity constraints on high-order linear prediction coefficients and prediction residuals has proven successful in overcoming some of the limitations of conventional linear predictive modeling. However, this modeling scheme, named sparse linear prediction, is generally formulated as a linear programming problem that comes at the expense of a much hig...

Fully vector-quantized neural network-based code-excited nonlinear predictive speech coding

Recent studies have shown that non-linear prediction can be implemented with neural networks, and that non-linear predictors will on average achieve about 2-3 dB improvement in prediction gain over conventional linear predictors. In this paper, we take advantage of non-linear prediction with neural networks, apply it to predictive speech coding, and attempt to improve the speech coding performance....

Speech Compression Using Linear Predictive Coding (LPC)

One of the most powerful speech analysis techniques is the method of linear predictive analysis. This method has become the predominant technique for representing speech for low bit rate transmission or storage. The importance of this method lies both in its ability to provide extremely accurate estimates of the speech parameters and in its relative speed of computation. The basic idea behind l...

Linear Predictive Speech Coding Using Fermat Number Transform

This paper addresses the reduction of the computational complexity of a speech codec. A Linear Predictive Coding procedure is developed that allows its implementation with Number Theoretic Transforms. Using the Fermat Number Transform can significantly reduce the cost of implementing the Linear Predictive algorithm on a digital signal processor.

Publication date: 2010